Real-world utilization of SARS-CoV-2 serological testing in RNA positive patients across the United States

Background As diagnostic tests for COVID-19 were broadly deployed under Emergency Use Authorization, there emerged a need to understand the real-world utilization and performance of serological testing across the United States. Methods Six health systems contributed electronic health records and/or claims data, jointly developed a master protocol, and used it to execute the analysis in parallel. We used descriptive statistics to examine demographic, clinical, and geographic characteristics of serology testing among patients with RNA positive for SARS-CoV-2. Results Across datasets, we observed 930,669 individuals with positive RNA for SARS-CoV-2. Of these, 35,806 (4%) were serotested within 90 days; 15% of which occurred <14 days from the RNA positive test. The proportion of people with a history of cardiovascular disease, obesity, chronic lung, or kidney disease; or presenting with shortness of breath or pneumonia appeared higher among those serotested compared to those who were not. Even in a population of people with active infection, race/ethnicity data were largely missing (>30%) in some datasets—limiting our ability to examine differences in serological testing by race. In datasets where race/ethnicity information was available, we observed a greater distribution of White individuals among those serotested; however, the time between RNA and serology tests appeared shorter in Black compared to White individuals. Test manufacturer data was available in half of the datasets contributing to the analysis. Conclusion Our results inform the underlying context of serotesting during the first year of the COVID-19 pandemic and differences observed between claims and EHR data sources–a critical first step to understanding the real-world accuracy of serological tests. 
Incomplete reporting of race/ethnicity data and a limited ability to link test manufacturer data, lab results, and clinical data challenge the ability to assess the real-world performance of SARS-CoV-2 tests in different contexts and the overall U.S. response to current and future disease pandemics.


Introduction
Coronavirus disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), originally identified in Wuhan, China in December 2019 [1]. In January 2020, COVID-19 was declared a public health emergency in the United States as the disease continued to spread worldwide. As new variants continue to threaten health and well-being across the globe, valid serology tests are needed to support the characterization of immune response, overall and within different subpopulations, to identify effective treatments, prophylaxis, and mitigation strategies [2,3]. Given the public health emergency, currently authorized serologic assays to test for antibodies against SARS-CoV-2 have not undergone the same evidentiary review standards required for Food and Drug Administration (FDA) approval [4,5]. A collaboration among the US National Cancer Institute, Centers for Disease Control and Prevention (CDC), Biomedical Advanced Research and Development Authority (BARDA), and the FDA led to the development of a dataset to compare the performance characteristics of different serological tests that were independently evaluated using sample panels of patients who were positive and negative for SARS-CoV-2 antibodies [6]. However, as the sample size of the dataset is limited, more robust population-based studies on the accuracy of serology tests are needed to support assay selection and implementation, interpretation of seroepidemiologic studies, and estimates of COVID-19 prevalence and immune response [7]. Additionally, given disproportionate infection rates in communities of color [8] and asymptomatic spread and carriage of COVID-19 [9][10][11][12], understanding the best use of serologic tests to estimate the true prevalence of disease and immunity is critical to developing sound public health mitigation strategies that serve all communities.
A critical piece to enable the assessment of real-world performance is the ability to link manufacturer test information, lab results, and patient healthcare data. Despite several initiatives to improve interoperability of healthcare data, there are few incentives to create digital "bridges" enabling public health and research networks to leverage more complete data sets for rapid analysis and discovery [13]. The absence of unique device identifiers (UDIs) for clear and unambiguous identification of specific diagnostic tests and the limited integration and flow of manufacturer assay information impede the interpretation of seroepidemiologic studies and estimates of COVID-19 prevalence.
An initial step to address this challenge is to identify which metadata can be captured and explore approaches to transmitting data between the instrument, laboratory information system (LIS), and electronic health record (EHR). Enabling such interoperability would likewise allow us to assess the real-world performance of serological tests and describe results in the context of clinical symptoms. Additionally, disproportionately high infection rates in underserved communities and asymptomatic carriage and spread of SARS-CoV-2 [9,11] underscore the need for reliable serologic test reporting to accurately estimate disease prevalence and to develop equitable public health mitigation strategies [14,15]. Recent studies by the Centers for Disease Control and Prevention (CDC) describe SARS-CoV-2 seroprevalence across the U.S. from convenience samples retrieved from routine blood chemistry [16], and others describe the duration of antibody response [17][18][19][20]. However, to our knowledge, few studies characterize the real-world use of serological testing for COVID-19, particularly in the context of symptoms and race [21].
To address these gaps, the Reagan-Udall Foundation for the FDA, in collaboration with the FDA and Friends of Cancer Research, has convened the COVID-19 Evidence Accelerator (EA). The EA is a consortium of leading experts in health systems research, regulatory science, data science, and epidemiology, specifically assembled to analyze health system data to address key questions related to COVID-19. The EA provides a platform for rapid learning and research using a common analytic plan. In May 2020, the EA launched the Diagnostics EA. As part of the Diagnostics EA, we examined patterns of COVID-19 serological testing using real-world data among different populations and clinical characteristics. Specifically, our objectives were to 1) understand the current state of data interoperability across instrument, laboratory, and clinical data; 2) describe serological testing by demographic, environmental characteristics (e.g., geographic location), baseline clinical presentation, key comorbidities (e.g., diabetes and cardiovascular disease), and bacterial/viral co-infections (e.g., influenza); and 3) assess the timing of serology testing relative to molecular testing date by the characteristics listed above. Characterizing how serology tests were used (including which tests were used, when, and in whom), as well as potential gaps in data, provides important context for interpreting future results on diagnostic accuracy.

Study population and setting
A call to participate in this descriptive analysis was put out to the Evidence Accelerator (EA) community. Six health systems answered the call and collaborated on the Diagnostics EA: Aetion and HealthVerity, Health Catalyst, Mayo Clinic, OptumLabs, Regenstrief Institute, and the University of California Health System. Health Catalyst, Mayo Clinic, and the University of California Health System all utilized EHR data from their respective healthcare delivery systems, Regenstrief Institute accessed EHR clinical data from the Indiana health information exchange [22,23], while Aetion and OptumLabs utilized medical and pharmacy claims, as well as data directly from laboratories. Furthermore, Aetion drew hospital billing data from the HealthVerity Marketplace. OptumLabs utilized administrative claims data from a single, large, U.S. insurer. We refer to these health systems as partners A-F for the purposes of anonymity. Data sources included in the analysis are generally categorized as either payer (claims) or healthcare delivery systems. As illustrated in Fig 1, data were drawn from across the U.S. with heavy representation in California, Illinois, Ohio, and Michigan. Characteristics of participating data sources and representative populations are described in S1 Table.

Study design
Each partner analyzed data collected from their distinct sources according to a master protocol and identified patients across settings (e.g., inpatient, outpatient, or long-term care facility) who tested positive for SARS-CoV-2 ribonucleic acid (RNA) by molecular test between March-September 2020, except one partner who went through April 30, 2021 (Fig 2). "Date of RNA positive" served as the index (cohort entry) date and was defined hierarchically as either the date at 1) sample collection; 2) accession; or 3) result. Among datasets that included primarily claims data, our protocol excluded persons who did not have evidence of enrollment for at least six months in the year before the index to decrease bias in the capture of pre-existing conditions. We did not implement similar data requirements from healthcare delivery systems and health information exchanges (HIEs), given the lack of membership data. We identified comorbidities (pre-existing conditions) 365 days before the index date.
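The hierarchical definition of the index date can be illustrated with a minimal sketch. This is not the partners' analytic code; the function name `index_date` and its argument names are hypothetical, and the sketch assumes that a missing date is represented as `None`.

```python
from datetime import date
from typing import Optional


def index_date(
    collection: Optional[date],
    accession: Optional[date],
    result: Optional[date],
) -> Optional[date]:
    """Return the first available date, in the protocol's order of preference:
    1) sample collection, 2) accession, 3) result."""
    for candidate in (collection, accession, result):
        if candidate is not None:
            return candidate
    return None  # no usable date recorded for this RNA positive test


# Hypothetical record with no collection date: the accession date is used.
example = index_date(None, date(2020, 6, 2), date(2020, 6, 4))
```

Because the fallback order matters, partners that relied on result dates may assign a later index date than partners with collection dates for otherwise identical tests.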
Follow up for serological testing, excluding immunoglobulin M tests, went through 90 days after the index date in all but one partner who identified all RNA positive and serology tests through April 30, 2021 without additional follow-up time for serology. Multiple serological measures were captured. Among those who received a serological test, we described the prevalence of presenting symptoms; concomitant infections with influenza and respiratory syncytial virus; time (in days) to the first serological test; and the number of serological and molecular tests in the 90 days after index.
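As an illustration of the follow-up window, the sketch below computes the time in days to the first serological test. It is a simplified example rather than the protocol's implementation: the function name is hypothetical, and it assumes the window includes both day 0 (same-day testing) and day 90, and that serology tests dated before the index are ignored.

```python
from datetime import date
from typing import Iterable, Optional

FOLLOW_UP_DAYS = 90  # follow-up window after the index date, per the study design


def days_to_first_serology(index: date, serology_dates: Iterable[date]) -> Optional[int]:
    """Days from the index date to the earliest serology test in the window.

    Returns None when no serology test falls within 0..FOLLOW_UP_DAYS days
    (assumption: both endpoints are included).
    """
    offsets = [(d - index).days for d in serology_dates]
    in_window = [o for o in offsets if 0 <= o <= FOLLOW_UP_DAYS]
    return min(in_window) if in_window else None
```

A patient with a return value of `None` would count as not serotested in this sketch; a value below 14 corresponds to the early serotesting discussed in the Results.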
To minimize the effect of differential missingness between partners, we did the following: 1) included all persons with an office or telephone visit in the +/-14 days around the index date to enable as complete an assessment of presenting symptoms as possible; 2) in claims systems, included only persons with at least six months of enrollment in the year before index; 3) estimated the proportion of patients at each site who had zero encounters in the prior year to contextualize our capture of pre-existing conditions; and 4) excluded variables from analysis if ≥30% of values were missing. Between 35-65% of patients identified from health care delivery systems had no documented encounter in the system in the 365 to 15 days before the index date. In contrast, only 11% of patients from national insurers had zero claims in the baseline period. We also assessed the distribution of age, sex, and geography in those with and without data on serology manufacturers. We did not observe any difference by age or sex in those with known versus unknown serology manufacturer information. In a single partner reporting <30% missing race/ethnicity, we observed over-representation of White and Hispanic individuals in those with known serology manufacturer data.
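The variable-level missingness rule can be sketched as follows. This is an illustrative example only, assuming that variables with 30% or more missing values are excluded and that missing values are stored as `None`; the function names and the example column names are hypothetical.

```python
from typing import Dict, List, Optional

MISSING_THRESHOLD = 0.30  # assumption: exclude a variable when >= 30% of values are missing


def missing_fraction(values: List[Optional[str]]) -> float:
    """Fraction of values that are missing; an empty column counts as fully missing."""
    if not values:
        return 1.0
    return sum(v is None for v in values) / len(values)


def variables_to_keep(columns: Dict[str, List[Optional[str]]]) -> List[str]:
    """Names of variables whose missing fraction is below the threshold."""
    return [
        name
        for name, values in columns.items()
        if missing_fraction(values) < MISSING_THRESHOLD
    ]


# Hypothetical partner extract: sex is complete, race is missing for half the rows.
columns = {
    "sex": ["F", "M", "F", "M"],
    "race": [None, None, "White", "Black"],
}
kept = variables_to_keep(columns)
```

Applying the rule per partner, rather than on pooled data, matches the parallel design in which each partner ran the analysis on its own dataset.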

Measures
Demographic and environmental characteristics, baseline clinical presentation, key comorbidities, bacterial/viral co-infections, and test characteristics potentially related to serological testing were included in the analysis (S1 Fig). We identified comorbidities and clinical presentation using phenotypes defined by ICD-10, and/or National Drug Codes. We provided coding algorithms used for other EA studies and from FDA's Sentinel Initiative for partners to use, while some partners used existing algorithms generated within their systems. The ICD-10 codes used to identify comorbidities are listed in S2 Table. Given differences in data availability across partners, each partner identified which of the prescribed covariates could be included in their analyses.

Manufacturer data
We interviewed diagnostic manufacturers, clinical laboratory directors, middleware and information technology vendors, and clients to understand the data generated by the instrument and the data flow from the instrument to information systems for laboratory and clinical data.

Statistical analysis
Descriptive analyses were performed separately by each contributing data partner in accordance with a common analytic plan. Among persons with and without serology, we calculated the distribution by age, sex, race, ethnicity, U.S. region, pre-existing medical conditions (including but not limited to cardiovascular disease, hypertension, kidney disease, asthma, dementia, chronic liver disease, etc.), smoking status, and obesity. We also analyzed body mass index (BMI), pregnancy status, presenting symptoms, and RNA test manufacturer. Among those with at least one serology test after index, we described the frequency of presenting symptoms and the specific manufacturer/assays at the time of the first serology test, and the time to the first test. We calculated the median and interquartile range (IQR) for the number of days between RNA and the first test. Separately, we included all serology and RNA tests after the index date to describe the median and IQR for the number of molecular and serological tests conducted after the index date.
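The median and IQR summary can be sketched with the Python standard library. The quartile method is not stated in the analytic plan, so the "inclusive" method used here is an assumption, as are the function name and the example values; the IQR is reported as a single width (Q3 minus Q1).

```python
from statistics import median, quantiles
from typing import Sequence, Tuple


def median_iqr(days: Sequence[float]) -> Tuple[float, float]:
    """Median and interquartile range (Q3 - Q1) of days from RNA to first serology.

    Assumption: quartiles use the 'inclusive' method; other software defaults
    may give slightly different quartiles. Needs at least two observations.
    """
    q1, _, q3 = quantiles(days, n=4, method="inclusive")
    return median(days), q3 - q1


# Hypothetical days from index RNA test to first serology test for five patients.
summary = median_iqr([0, 7, 10, 14, 31])
```

Because each partner computed these summaries separately, differences in quartile conventions between statistical packages are one possible source of small discrepancies across datasets.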
The WCG Institutional Review Board (IRB), the IRB of record for the Reagan-Udall Foundation for the FDA, reviewed the study and determined it to be non-human subjects research.

Results
Study samples ranged from 36,319 to 363,653 individuals per dataset, for a total of 930,669 people with a confirmed SARS-CoV-2 infection by molecular test. Five partners contributed data from March 1 to September 30, 2020, and a sixth partner captured data through April 30, 2021. As described in Table 1, the study population across all datasets was predominantly female, White, and 45-64 years of age. The geographic distribution of patients included in the analyses represented the population in each of the health systems, with two national datasets drawing primarily from the Mid-Atlantic region. Among two datasets, a majority of the sample population had no evidence of pre-existing conditions, whereas in two nationally representative samples, 30-50% had evidence of such. The most prevalent pre-existing conditions across healthcare partners were diabetes, hypertension, cardiovascular disease, obesity, and lung conditions. Across all healthcare partners, 4-11% of the female population were pregnant in the 40 weeks before the index date. The most common presenting symptoms at index were cough, shortness of breath, and pneumonia. The prevalence of lab-confirmed concomitant respiratory syncytial virus or influenza was <1%.

Serological testing (serotesting)
Generally, 3-6% of those with confirmed infection were serotested, a total of 35,806 people observed across all datasets. Nearly all follow-up serological tests were immunoglobulin G (IgG) tests (Table 2). Generally, each partner utilized one or two primary serology tests and did not support a large number of tests. Serology manufacturer and test name were captured by four analytic partners, and were mostly complete (<30% missing) for three included in this analysis (A, C, E). One of our largest partners was missing manufacturer data in 85% of the sample, and two partners were missing it completely. While manufacturer and assay name, as well as other metadata, are typically captured and available for export from the instrument, oftentimes laboratory information systems are not configured to receive or store this information. Constraints on integration include technical limitations of software and middleware, as well as a lack of clinical need, business case, or regulatory incentive. Capturing, storing, and transferring this additional data would require a substantial investment of resources to modify and/or reconfigure existing instruments, laboratory information systems, connective middleware, and EHRs. Absent a regulatory or reimbursement requirement, companies perceive little need to invest such resources given other competing priorities.

Serotesting by demographic characteristics
Overall, we observed a higher distribution of persons aged 45-64 among those serotested compared to those not serotested. Four partners representing healthcare delivery systems reported race with <30% missing. Across three of these partners, we observed a higher distribution of White individuals among those serotested compared to those not. We did not observe a consistent pattern in serotesting by sex.
Five partners had representation across more than one region of the U.S. In partners with national representation, patients from the West North Central (Iowa, Nebraska, Kansas, North Dakota, Minnesota, South Dakota, Missouri) and West South Central (Arkansas, Louisiana, Oklahoma, Texas) regions were under-represented among the serotested. Two partners operated primarily in a single U.S. state and thus did not allow assessment of geographic differences.

Table 1 notes:
1 At the time of RNA or serological sample refers to +/-14 days from the sample collection date for the relevant test.
2 The unaccounted samples in Partners A and B were missing.
3 Data was not available.
4 Hispanic ethnicity was not mutually exclusive from race.
5 Phenotypes (code-sets) of ICD-10, medication, and LOINC are provided in S2 Table. Conditions may be identified using ICD-10, medication, or both.
6 Pre-existing conditions were assessed 365 days before the index date and were not mutually exclusive.
7 Pregnancy status was assessed up to 40 weeks before the index date.
9 The unaccounted samples in Partner A were missing.
10 The FDA issued guidance for clinical laboratories, commercial manufacturers, and FDA staff on the use of diagnostic and serological tests for COVID-19 on May 16, 2020. https://www.fda.gov/news-events/fda-voices/insight-fdas-revised-policy-antibody-tests-prioritizing-access-and-accuracy.
11 No pre-existing conditions: defined as those identified to have none of the above listed pre-existing conditions.
12 Because some partners did not collect and report some variables, care should be taken when interpreting the total number of each variable.
https://doi.org/10.1371/journal.pone.0281365.t001

PLOS ONE
Real-world SARS-CoV-2 serological testing

Serotesting by care-setting, symptoms, and pre-existing conditions
Half of the partners reported care-setting. Generally, most of the population was seen in the outpatient setting for their index visit. Large national insurer data did not suggest any differences in the distribution of index visit care settings among serotested vs. non-serotested. However, EHR data from a large national health data consortium revealed a higher distribution of patients in the inpatient setting among the serotested compared to non-serotested (13% vs. 2%, respectively).
As shown in Table 3, four of six partners reported presenting symptoms at index. Patterns in serotesting by symptoms seem to align with the data source. In partners who relied on claims data, we generally see no systematic trend in serotesting by presenting symptoms at the time of the index visit. Among systems that relied on EHR data, we see a higher distribution of patients with shortness of breath (15-20%), pneumonia (15-37%), and cardiovascular conditions (29-35%) among the serotested vs. non-serotested (10-15%, 10-16%, 17%, respectively). All but one data partner reported pre-existing conditions. We found individuals with pre-existing cardiovascular disease tended to have greater representation in the serotested (35%-48%) vs. non-serotested group (17%-40%). In partners with EHR data, a greater distribution of patients with pre-existing obesity and kidney disease was also observed among the serotested compared to non-serotested. We did not observe a differential in testing among pregnant women, although only half of the contributing partners reported pregnancy status. We observed similar patterns of pregnancy among women with serological testing (4-13%) compared with women without serological testing (2-11%), with a slightly higher range in prevalence of pregnancy among women with serological testing.
As shown in Table 3 and Fig 3, many of the same symptoms at the time of RNA testing persisted at the time of serotesting, which may be attributed to the high volume of same-day molecular and serological testing.

Table 2 notes:
5 We refer to the tests as Γ-Y for the purposes of anonymity. Most tests received an Emergency Use Authorization from the FDA. References available upon request.
6 The sum for Partner E's manufacturer-serological test name is classified as unknown/missing.
7 The sum between the molecular test sample type for Partner F includes all people that have a positive RNA test result.
https://doi.org/10.1371/journal.pone.0281365.t002

Frequency and time to serological testing
In all but one healthcare system, serological testing increased substantially after May 1, 2020 (Table 1). Serological testing among those with positive RNA ranged from 3-6% across our contributing partners. Among all people with follow-up samples, 15% had a follow-up serology within 14 days of the index RNA test (Fig 3).
Overall, the median time to serotesting from RNA per data partner ranged from 10-31 days and was shorter in datasets from systems with data from EHRs (Table 4). In terms of age, adults 85 years and older tended to have the shortest time to follow-up between molecular and serology testing (median range: 1-25 days). In partners with robust capture of race and ethnicity, Black patients (median: 7-15 days) tended to experience a shorter time to serotesting as compared to White individuals (median: 13-21 days). In half of the analytic datasets, time to serotesting tended to be shortest in people with a history of dementia (median: 2-15 days). Within and across datasets, there was substantial variability in time to serotesting by presenting symptoms at index. In the two partners reporting on pregnancy, time to serotesting did not tend to differ by pregnancy status.
In general, we did not observe repeat molecular or serological testing within the 90-day time frame. In partners A-E, the median (IQR) number of both tests was 1 (0), while in partner F it was 1 (1). Time to serotesting tended to be shorter for IgG tests as compared to total antibody. There was substantial variation in time to serological testing across manufacturer assays (both molecular and serological). We observed differences in time to serological testing across care settings in only one dataset, with the median time to serotesting being 0 days in the inpatient setting and almost one month in the outpatient setting. Patients with index dates after May 1, 2020 tended to wait fewer days for serological testing (median: 7-27) compared to those with index dates before May 1, 2020 (median: 28-43). This difference may be explained by the lower availability of SARS-CoV-2 tests before May 1, since serology tests were not authorized before April 15, 2020, and molecular tests were not authorized before March 15, 2020.

Discussion
The Centers for Disease Control and Prevention has initiated several large-scale population-based seroprevalence studies throughout the U.S. [24]. We conducted this study to characterize the real-world use of COVID-19 serological testing. We identified a number of key findings: 1) a substantial proportion of serology tests were conducted within 14 days of the RNA test, the majority of which occurred on the same day as the positive RNA test; 2) a lack of data interoperability between the instrument, laboratory, and clinical data could limit the ability to conduct a large-scale assessment of the real-world performance of not only COVID-19 tests, but other diagnostic and laboratory tests; 3) missing race/ethnicity data may impede a comprehensive understanding of racial disparities involved in COVID-19 serology and immunity; and 4) important differences in the testing landscape presented from claims vs. EHR data sources may impact results generated from these data sources.
We assumed the date of a positive SARS-CoV-2 molecular test would be a reasonable proxy for symptom onset. We did not expect that 15% of serotesting would occur within 14 days of the RNA test, and most often on the same day. This is an important finding because we would not expect concordance between molecular and serology tests taken in close proximity because of known viral kinetics [25][26][27]. After consulting with our analytic partners, we discovered the implementation of policies within health systems to screen patients admitted for procedures for active or past SARS-CoV-2 to evaluate the risk of nosocomial infections. These policies may be driving observed differences in the median time between molecular and serology tests in claims (31 days), compared to EHR datasets (10-24 days), with the nuance being washed out in larger claims datasets that incorporate a mix of care settings. Clinicians may also be serotesting because they do not believe that patients are presenting close to the time of exposure, desire a better understanding of patients' disease progression, or wish to assist in determining clinical course of care, which may depend on whether patients are at increased risk for severe illness due to insufficient antibody response [28]. For all diagnostic and serological tests authorized by the FDA, the FDA produces fact sheets for healthcare providers to provide information about the assay and its limitations [29]. Continued guidance and communication are needed to help clinicians understand how to best use serological tests for SARS-CoV-2 [30,31]. A higher distribution of patients presenting with respiratory, metabolic, and cardiovascular symptoms among the serotested compared to non-serotested is consistent with an evaluation by the CDC that indicated such factors are associated with severe COVID-19 illness [32].
Patients with a pre-existing history of cardiovascular disease (including hypertension) and liver disease were over-represented among those serotested vs. those not serotested in multiple datasets. These conditions have been shown to be associated with excess risk in other studies  [33,34]. It was surprising that we did not observe any differences in the distribution of cancer in those serotested compared to the non-serotested. More research is needed to understand why some patients with known active SARS-CoV-2 infection receive a serology test, while others do not. Across care delivery systems, a notable observation was increased serological testing in White as compared to Black individuals. However, when Black patients did receive serology testing, the time to testing was shorter, which may be due to a pressing need to identify the presence of antibodies/past infection in populations who have been shown to be at higher risk of COVID-19 morbidity and mortality [17]. More importantly, data on race from a large national insurer was missing in about 80% of the sample. Without data on race and ethnicity, the racial disparities in COVID-19 outcomes-and healthcare in general-cannot be addressed.
Another important information gap is in manufacturer data. Despite targeted conversations with technology teams and experts in technical, syntactic, and semantic interoperability, only half of analytic partners were able to integrate test manufacturer data with LIS and EHR data. A lack of data interoperability within healthcare is a historic problem [35]. Such interoperability is the foundation for public health surveillance, research, artificial intelligence, medical advances, and quality assurance in the context of EUA [36,37]. Healthcare systems, manufacturers, and information technology vendors should move to fill information gaps to improve response to COVID-19 and future public health threats.

Differences in results reported by claims vs. EHR-based systems
Analytic partners ran their analyses in parallel and aligned on a common analytic plan. We did not pool data, which allowed us to highlight, rather than control for, differences across partners. Different patterns between EHR and claims systems were apparent in our analysis. In general, claims datasets showed no difference in serotesting by care setting or presenting symptoms, whereas EHR systems did. And while all datasets showed an elevated prevalence of pre-existing cardiovascular disease observed among those serotested (compared to the non-serotested), EHR datasets also showed a greater distribution of people with pre-existing obesity, kidney disease, and chronic lung conditions among the serotested. Because healthcare delivery systems generally have a limited ability to capture all clinical events for a given patient [38], sicker patients may be driving identification within certain health systems and pre-existing conditions may have been missed in patients who do not regularly attend the facility for care but were diverted to the facility [38]. Our data support this hypothesis on both points of increased illness among patients and lower identification of pre-existing conditions among patients identified from EHR compared to claims data sources. These differences may influence the interpretation of serology tests [38][39][40][41][42].

Strengths and limitations
Our study has many strengths. This was a large assessment of serotesting across the U.S. in diverse datasets leveraging either EHR or claims data. We developed a protocol that incorporated the unique characteristics of each data source and provided a forum to transparently communicate and collaborate on study design and interpretation. We also established a platform to rapidly collect and analyze data from various systems to evaluate process improvement and identify important trends over time. Such a platform may be used to evaluate process improvement and comparisons within data systems.
Our study also has some important limitations. First, we were unable to assess the independence of samples across the healthcare partners directly. Three partners provide national coverage, and thus large sample sizes. The geographic distribution of their populations does not suggest overlap. However, single health systems included in the same geographic region as the larger healthcare partners (specifically in the Pacific and Mountain regions) may be double counted. Second, smoking status, BMI, and race were largely missing in our analysis. These are important characteristics in assessing the impact of COVID-19 on the health of the population. Third, the sample collection date was not always available, and the result date was used by some partners. As such, it is possible that samples collected on the same day may have different result dates if tests were run sequentially. Fourth, manufacturer information was largely missing from two of our largest datasets because instrument data either did not flow to the laboratory information system (LIS), or those results were not transmitted from the LIS to the EHR or payer database. However, we did not find differential missingness by age, sex, or geography among individuals with and without manufacturer data. Finally, lack of data on COVID-19 exposure and symptom onset limits our ability to make future inferences on appropriate pairs of molecular and serological tests to assess serological performance for past infection. We note that assumptions regarding the proximity of RNA testing to symptom onset may not be reliable over time. Testing for active infection has gone from severely limited at the start of the pandemic (March-April 2020) to widely available today. People may receive serial RNA testing without suspected exposure for purposes of employment or recreational gathering with friends and family.
As in all observational datasets, the completeness of our assessment is dependent on the capture of events in each of our healthcare data partners. Indeed, we observed that a greater proportion (35-65%) of patients identified in EHR data had no encounter in the year prior to index, compared to 11% among those identified from payer data. Coupled with our observation from EHRs that there seemed to be a greater number of pre-existing conditions for which there was preferential serotesting, these data provide additional evidence that patients identified through EHR data sources may tend to be sicker than those identified in claims. Furthermore, not knowing "care setting" for a large portion of tests could affect interpretation of the performance of serology testing as well, since the sensitivity of serology assays appears to be lower in mildly sick and/or asymptomatic cohorts.

Conclusion
Our results inform the underlying context of serotesting during the first year of the COVID-19 pandemic and differences in serotesting trends observed from claims and EHR data sources, a critical first step to understanding the real-world accuracy of serological tests. The limited ability to link test manufacturer data with lab results and clinical data, and incomplete reporting of race/ethnicity data challenge the ability to assess real-world performance of SARS-CoV-2 tests in different populations and settings. These shortcomings challenge the overall U.S. response to current and future disease pandemics.
Supporting information S1